
    Deep learning models for predicting RNA degradation via dual crowdsourcing

    Medicines based on messenger RNA (mRNA) hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition (‘Stanford OpenVaccine’) on Kaggle, involving single-nucleotide resolution measurements on 6,043 diverse 102–130-nucleotide RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504–1,588 nucleotides) with improved accuracy compared with previously published models. These results indicate that such models can represent in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for dataset creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.

    Deep learning models for predicting RNA degradation via dual crowdsourcing

    Messenger RNA-based medicines hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ("Stanford OpenVaccine") on Kaggle, involving single-nucleotide resolution measurements on 6,043 diverse 102–130-nucleotide RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504–1,588 nucleotides) with improved accuracy compared to previously published models. Top teams integrated natural language processing architectures and data augmentation techniques with predictions from previous dynamic programming models for RNA secondary structure. These results indicate that such models are capable of representing in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for data set creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.
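
The "within experimental error" figure quoted in the abstract above can be read as a simple per-nucleotide check. The sketch below is only an illustration of that reading: the abstract does not give the exact criterion, so the rule used here (absolute difference no larger than the reported measurement error) and the names `fraction_within_error`, `predicted`, `measured`, and `error` are assumptions, not the competition's scoring code.

```python
from typing import Sequence


def fraction_within_error(
    predicted: Sequence[float],
    measured: Sequence[float],
    error: Sequence[float],
) -> float:
    """Return the share of positions where |predicted - measured| <= error.

    Assumed criterion for illustration only; the abstract does not define
    how "within experimental error" was computed.
    """
    if not (len(predicted) == len(measured) == len(error)):
        raise ValueError("all three sequences must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one nucleotide is required")

    # Count positions whose prediction falls inside the error band.
    hits = sum(
        1
        for p, m, e in zip(predicted, measured, error)
        if abs(p - m) <= e
    )
    return hits / len(predicted)


if __name__ == "__main__":
    # Made-up values for four nucleotides, not data from the study.
    share = fraction_within_error(
        predicted=[0.5, 1.0, 0.2, 0.9],
        measured=[0.55, 0.6, 0.2, 1.0],
        error=[0.1, 0.1, 0.05, 0.05],
    )
    print(f"{share:.0%} of predictions fall within the error band")
```

Under this reading, a reported value of 41% would mean that roughly four in ten nucleotide-level predictions landed inside the measurement's own uncertainty.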

    Crosslinking of a Peritrophic Matrix Protein Protects Gut Epithelia from Bacterial Exotoxins.

    Transglutaminase (TG) catalyzes protein-protein crosslinking, which has important and diverse roles in vertebrates and invertebrates. Here we demonstrate that Drosophila TG crosslinks drosocrystallin, a peritrophic matrix protein, to form a stable fiber structure on the gut peritrophic matrix. RNA interference (RNAi) of the TG gene was highly lethal in flies and induced apoptosis of gut epithelial cells after oral infection with Pseudomonas entomophila. Moreover, AprA, a metalloprotease secreted by P. entomophila, digested non-crosslinked drosocrystallin fibers, but not drosocrystallin fibers crosslinked by TG. In vitro experiments using recombinant drosocrystallin and monalysin proteins demonstrated that monalysin, a pore-forming exotoxin of P. entomophila, was adsorbed on the crosslinked drosocrystallin fibers in the presence of P. entomophila culture supernatant. In addition, gut-specific TG-RNAi flies had a shorter lifespan than control flies after ingesting P. entomophila, whereas the lifespan after ingesting AprA-knockout P. entomophila was at control levels. We conclude that drosocrystallin fibers crosslinked by TG, but not non-crosslinked drosocrystallin fibers, form an important physical barrier against exotoxins of invading pathogenic microbes.

    Information Extraction from Public Meeting Articles

    Public meeting articles are key to understanding the history of public opinion and the public sphere in Australia. Information extraction from public meeting articles can yield new insights into Australian history. In this paper, we create an information extraction dataset in the public meeting domain. For 1,258 public meeting articles, we manually annotate the date and time, place, purpose, people who requested the meeting, people who convened the meeting, and people who were convened. We further present an information extraction system, which formulates information extraction from public meeting articles as a machine reading comprehension task. Experiments indicate that our system can achieve an F1 score of 74.98% for information extraction from public meeting articles.
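
When extraction is framed as machine reading comprehension, each annotated field becomes a question and the system returns an answer span that is scored against the annotated one. The abstract above does not say how its F1 score was computed, so the sketch below assumes the token-overlap F1 commonly used for reading-comprehension benchmarks. The function name `token_f1` and the example strings are hypothetical.

```python
from collections import Counter


def token_f1(predicted: str, gold: str) -> float:
    """Token-overlap F1 between a predicted answer span and a gold span.

    Assumed metric for illustration; whitespace tokenization and
    lowercasing are simplifications, not the paper's stated procedure.
    """
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()

    # If either span is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical question: "Where was the meeting held?"
    score = token_f1("Town Hall Sydney", "the Town Hall")
    print(f"F1 = {score:.3f}")
```

A corpus-level score would then typically be the average of such per-answer values over all questions, though the paper may aggregate differently.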

    TG-dependent protection against P. entomophila infection in the gut.

    (A) Survival analysis of gut-specific TG-RNAi flies (NP1>TG IR) and their counterparts (NP1>+) upon oral infection with P. entomophila (Pe) or Ecc15. Statistical analysis was performed using a log-rank test. At least 50 flies were used. N.S., not significant. (B) Cell-death was quantified by propidium iodide staining. Results represent the percentage of dead cells (propidium iodide-positive nuclei) in the midguts of flies infected for 4 h with P. entomophila (Pe). Results represent the mean of 10 independent experiments. Statistical analysis was performed by one-way analysis of variance followed by Bonferroni correction for multiple comparisons to evaluate the pairwise difference. UC, unchallenged. (C) A schematic model of the TG-mediated peritrophic matrix formation. TG crosslinks drosocrystallin (Dcy) on the peritrophic matrix (PM). Crosslinked drosocrystallin is not digested by AprA, and the crosslinked drosocrystallin strengthens the peritrophic matrix to function as a physical barrier against exotoxins of pathogenic microbes.

    TG-dependent polymerization of drosocrystallin in vitro and in vivo.

    (A) Wild-type drosocrystallin (WT) or the KR mutant (KR) was incubated with TG, and subjected to SDS-PAGE in 10% TGX FastCast gels (Bio-Rad Laboratories). These recombinants were detected by Western blotting with a horseradish peroxidase-conjugated anti-6 × His tag antibody. Monodansylcadaverine (MDC) was used as an inhibitor of protein-protein crosslinking. Data are representative of at least three independent experiments. (B) Gut extracts from systemic TG-RNAi flies (Da>TG IR) and their counterparts (Da>+) were subjected to SDS-PAGE in 12% slab gels. Native drosocrystallin in the extracts was detected by Western blotting with an anti-drosocrystallin antibody (Anti-Dcy). Asterisks indicate unknown cross-reacted proteins. Data are representative of three independent experiments.

    Polymerized drosocrystallin protects against AprA.

    (A) Wild-type drosocrystallin (WT) was incubated at 37°C for 30 min with or without TG, and then the culture supernatant from P. entomophila (Pe) was added, and the mixture was subjected to SDS-PAGE in 10% TGX FastCast gels. Open arrowhead, the monomeric recombinant; closed arrowhead, the crosslinked recombinant. Data are representative of at least three independent experiments. (B) Wild-type drosocrystallin was subjected to SDS-PAGE in 10% slab gels and detected by Western blotting after incubating with each fraction obtained by gel filtration of the culture supernatant from P. entomophila. (C) Fraction No. 27 from the gel filtration was subjected to SDS-PAGE in 15% slab gels, and proteases in this fraction were identified by liquid chromatography tandem mass spectrometry analysis. (D) Wild-type drosocrystallin was incubated with purified AprA at 25°C or 37°C and analyzed by SDS-PAGE in 10% slab gels, and detected by Western blotting using anti-6 × His tag antibody (upper panel). Western blotting data are representative of four independent experiments. The relative intensity of each band compared to that of the untreated protein (0 min) was calculated using ImageJ software (lower panel). (E) Wild-type drosocrystallin was incubated with the culture supernatant from P. entomophila (Pe) or the AprA-knockout strain (PeΔaprA), analyzed by SDS-PAGE in 10% slab gels, and detected by Western blotting using anti-6 × His tag antibody. Western blotting data are representative of three independent experiments (upper panel). The relative intensity of each band compared to that of the untreated protein (0 min) was calculated using ImageJ software (lower panel).

    TG-dependent incorporation of monodansylcadaverine (MDC) or biotin pentylamine into drosocrystallin recombinants.

    (A) Wild-type drosocrystallin (WT), the KR mutant (KR), or bovine serum albumin (BSA) was incubated with chitin, and each fraction (Fr) was analyzed by SDS-PAGE in 10% slab gels (left panel). The intensity of each fraction relative to that of the input was summed as the bound fraction. Bars indicate the mean and standard deviations of experiments performed in triplicate (right panel). BSA was used as a negative control. Open bars, unbound fraction; closed bars, bound fraction. (B) Wild-type drosocrystallin (WT) or the KR mutant (KR) was incubated with MDC in the presence of TG, and analyzed by SDS-PAGE in 10% slab gels. The proteins were stained with Coomassie brilliant blue (CBB), and the MDC-incorporated protein was detected by the emission intensity of the dansyl group. Data are representative of three independent experiments (left panel). The relative emission intensity of each fraction compared to that of CBB-stained protein was calculated using ImageJ software (right panel). (C) Wild-type drosocrystallin (WT) or the KR mutant (KR) was incubated with or without biotin pentylamine (BPA) in the presence of TG, and subjected to SDS-PAGE in 10% slab gels. Incorporation of BPA was detected with horseradish peroxidase-conjugated streptavidin. Loaded recombinant proteins were detected by Western blotting with a horseradish peroxidase-conjugated anti-6× His tag antibody. Data are representative of at least three independent experiments.